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Abstract. A major current challenge poses the systematic construction of 
coarse-grained models that are dynamically consistent, and, moreover, might be 
used for systems driven out of thermal equilibrium. Here we present a novel 
prescription that extends the Markov state modelling approach to driven systems. 
The first step is to construct a complex network of microstates from detailed 
atomistic simulations with transition rates that break detailed balance. The 
coarse-graining is then carried out in the cycle space of this network. To this 
end we introduce the concept of representatives, which stand for many cycles 
with similar properties. We show how to find these cycle communities using well- 
developed standard algorithms. Removing all cycles except for the representatives 
defines the coarse-grained model, which is mapped back onto a network with far 
fewer states and renormalized transition rates that, however, preserve the entropy 
production of the original network. Our approach is illustrated and validated for 
a single driven particle. 


1. Introduction 

The reduction of degrees of freedom is arguably the most crucial step in computational 
sciences. Numerical models can be formulated at several levels of detail: from ab 
initio including the electronic degrees of freedom, to classical force fields, to effective 
coarse-grained models, up to continuum models on the macroscopic scale. Limited 
computational capacities then imply that these levels correspond to increasing length 
and time scales that can be accessed at the expense of decreasing molecular detail. 
Indeed, for many applications details of the molecular interactions are irrelevant and 
models can be devised based on symmetry considerations and conservation laws alone, 
with the Navier-Stokes equations being a prominent example for a continuum model 
of fluid dynamics. 

On intermediate levels, systematic approaches to structural coarse-graining have 
been developed 01 such as iterative Boltzmann inversion, which determines pair 
potentials for a reduced set of degrees of freedom from (partial) radial distribution 
functions obtained either from experiments or atomistic simulations. These relevant 
degrees of freedom have to be provided by the user and are typically informed by 
physical or chemical insights. One side effect of such a procedure is the loss of 
dynamical information, which is understood qualitatively (removed degrees of freedom 
act as a “bath” leading to stochastic dynamics i) but still hard to control. 

Quite a different approach is followed through the construction of Markov state 
models [4]-[^, which aim to bridge the long time scales involved in, e.g., the folding of 
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proteins from an initially disordered coil to the native state. These models are built by 
clustering microstates into a few mesoscopic states that are kinetically distinct, le., 
they correspond to basins that are separated by free energy barriers with a time-scale 
separation between fast intra-basin transitions and slow inter-basin transitions. In the 
case of a complete separation this can be exploited to endow the mesoscopic states 
with a Markovian dynamics for the slow transitions assuming quasi-equilibrium for 
the fast transitions. 

One should note that a rigorous approach to dynamically consistent coarse- 
graining has been around for quite some time using projection techniques [^|^. By 
projecting the dynamics onto the subspace spanned by the reduced variables, a non- 
Markovian generalized Langevin equation for the time evolution in this subspace is 
obtained, which is formally exact but now takes on the form of an integro-differential 
equation. The complexity of the original problem is, therefore, preserved as ones trades 
the reduction of the degrees of freedom for a memory kernel. Further approximations 
are needed to obtain a numerically tractable problem. In the case of a clear separation 
of time scales one can employ the Markovian approximation [^. But even then the 
resulting equations are highly non-trivial both from a physical and mathematical point 
of view. Within this framework attempts have been made to derive dynamical density 
functional theory 


10 and its extension to non-equilibrium situations II 12 


Indeed, another layer of complexity is added when going away from thermal 
equilibrium to driven systems. Stochastic thermodynamics has emerged as a 
comprehensive theoretical framework in particular for driven systems that are still 
in contact with a heat reservoir that itself remains in equilibrium 13 . Here two 


classes of non-equilibrium situations can be distinguished: time-dependent processes 
with dynamics that still obey detailed balance and non-equilibrium steady states 
with dynamics that explicitly break detailed balance through non-conservative forces 
and/or flows [14| . Fluctuating path functionals like work, heat, and entropy 
production obey certain symmetries called fluctuation theorems 13 , which have been 
studied for (chemical) networks of discrete states 15 ■ 17 . Hidden (unobservable) 


degrees of freedom modify the fluctuation theorems in various ways and the 

consequences of coarse-graining have been discussed for removing fast states [20], 
“bridge” states 


21 , and the clustering of microstates 22 23 


So far, only relatively simple models have been studied appealing to stochastic 
thermodynamics including a network model of kinesin with six states 21 24 and 
a model for motor proteins 25 26 . Building on these works, the purpose of the 


present paper is to extend the Markov state model approach to driven many-body 
systems and to construct coarse-grained models that are consistent with stochastic 
thermodynamics. The paper is organized as follows: in sectionwe briefly revisit the 
prerequisites, in particular how to obtain a network of discrete states from atomistic 
simulation data and introduce the relevant notions from stochastic thermodynamics. 
In particular cycles and the decomposition of networks into cycles 24 27-30 will 


play a crucial role. The actual algorithm consists of two parts: the identification and 
clustering of cycles discussed in section followed by the actual coarse-graining in 
cycle space described in section Finally, we provide some critical remarks before 
concluding. 
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In order to maintain a non-equilibrium steady state, work has to be supplied 
constantly, which is dissipated as heat q into the environment. This non-vanishing 
entropy production allows for the transport of some quantity X, which might be 
the distance travelled by a particle, the number of molecules produced in a chemical 
reaction, etc. Although the mapping to Markovian dynamics proposed in this paper is 
not rigorous, there is evidence that the preservation of entropy production also implies 
the preservation of the major dynamical properties as well as macroscopic transport. 
Indeed, very recently Barato and Seifert have conjectured that the dissipation of a 
process leading to a squared relative uncertainty = {{X — {X))^)/{Xy is at 

least 


(g) = {q)t > 


( 1 ) 


31 . We employ dimensionless quantities 


(SX)^ ’ 

where t is the duration of the process 
throughout. In particular, entropies are measured in units of Boltzmann’s constant 
fee and energies in units of k^T, where T is the temperature of the surrounding heat 
bath. 

The first step is to construct a discrete Markov state model and to determine 
the transition rates from particle-resolved simulation data. Although already this 
step is non-trivial it is not the focus of the present manuscript, in which we study a 
sufficiently simple model to employ a straightforward construction. For our purposes 
it is important to include many discrete states to approximate the entropy production 
of the particle-resolved model as closely as possible. Entropy production is related 
to cycles in the network of these discrete states. It seems quite clear that removing 
states will lead to “cutting open” some of these cycles and thus to a reduced entropy 
production 


20 . Our approach to this challenge is an additional step before the 


actual coarse-graining, which consists of identifying communities of cycles with similiar 
properties. The crucial step is then to identify a representative for each community, 
which will receive the entire entropy production of its community. All other cycles 
are then removed, and with them all states that are not visited by one of the 
representatives. This procedure preserves the entropy production rate, and Eq. Q 
therefore implies that fluctuations of (measurable) transported quantities obey the 
same lower bound in the original and the coarse-grained system. 


3. Background 

3.1. Model system: driven particle in a double well potential 

We will illustrate our approach to coarse-graining with a specific example: a Brownian 
particle in a two-dimensional symmetric potential driven by a non-conservative force 
Fnc- Working in two dimensions allows to directly visualize the configurations space 
as well as cycles, although the discussed algorithm is more generally applicable to 
many-body systems. 

The equation of motion follows as 

X = —XU -f i^nc + (2) 

where ri{t) is a random force with correlations {ri{f)rj{t')) = 25{t — t') and 

U(.x) = ^ + + x^y^) 


(3) 
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Figure 1. Contour lines of the symmetric two-dimensional potential eq. § and 
exemplary trajectory in the presence of the non-conservative force (the arrows 
indicate the direction of the driving). The system is still invariant under inversion 
with the loci of highest probability being shifted from the minima of the potential. 


is the potential energy. As non-conservative force we choose I^nc(x) = uj(—y,x)'^, 
where uj denotes the driving strength. Figure[^shows the contour lines of the potential 
and an exemplary trajectory. Due to F^c, the particle trajectory does not obey the 
symmetry of the conservative potential anymore. 


3.S. Building Markov state models, in a nutshell 


We start by giving a short overview of how to create a Markov state model from, in 
general, molecular dynamics simulations, while here we employ Brownian dynamics 
(BD) propagating eq. ([^. A detailed review can be found in 32 . The first step in 


building a Markov state model is to approximate the continuous state space (only the 
positions are taken into account) by a set of discrete states that we will refer to as 
the microstates. The spatial discretization is typically obtained by cluster analyses, 
e.g. via the fc-means algorithm. After a proper spatial discretization is found, the 
continuous d-dimensional state space is discretized into k partitions. Each of the k 
partitions is represented by its center, called centroid, allowing the original data points 
to be mapped onto the centroids by minimizing their Euclidean distance. 

Once the full state space is discretized, the BD trajectory can be projected 
onto its centroids, storing its dynamical information as a simple sequence of centroid 
indices. The dynamical information can be extracted from this sequence of centroids 
by counting the number of transitions Ci^j. Whenever centroid i is followed by j we 


increase Ci^j by one. Finally, the transition matrix T is approximated by 


Ti^j — 


a 






(4) 


which is also the maximum probability estimator for the true transition operator 32 
If At is chosen small enough, we can further expand 

T = exp (AtIF) « 1 + AtW -b 0(At2) 


(5) 
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with rate matrix W. It is important to note that for driven systems neither W 
nor T are symmetric in general. Hence, their eigenvalues are complex, making the 
identification of metastable sets difficult. The reason is that the real part of the 
spectrum does not necessarily offer a gap, an illustrative example for which can be 
found in ref. 1381 

3.3. Stochastic thermodynamics 

3.3.1. Steady state Markov models The discrete states and their corresponding 
transition rates can be represented by a graph G{V,E), where vertices V represent 
the states and edges E the transitions connecting the states. The number of vertices 
in G is denoted by \V\, while the number of edges is \E\. 

If all transition rates Wj = Wi-^j are knowrj^ the time evolution of the state 
probabilities is given by the master equation 

= XI (t) (6) 

with normalization 

i 

Here = w'^pi are the directional probability fluxes indicating the amount of 
probability flowing from state i to state j per unit time. The subsequent description 
requires that the graph spanned by the Markov model is (i) ergodic, i.e., every state 
can be reached by every other state in finite time, and (ii) the transition rates are 
reversible, i.e., if Wj > 0 then also wj > 0. Hence, there is one edge for the forward 
and one edge for the backward transition between any two states that are connected 
through non-zero transition rates. 

If the system is in a steady state the probability distribution, and hence 
probability fluxes, are time independent and we can drop the time argument, Pi{t) = 
Pi. The left hand side of equation (§ becomes zero, 

= ( 8 ) 

i¥=3 

In the following we assume that the system resides in a (non-equilibrium) steady state. 

A special steady state is identihed as thermodynamic equilibrium. Here each 
summand of the right hand side of eq. (j^ vanishes individually, i.e. — (I>^’®q _ 
while the condition = wlp^S is called detailed-balance stating that the amount 

of transported probability per unit time in the transition i j equals the amount of 
the reverse transition j —3 i. If detailed-balance is broken a net probability current 
flows from i ^ j and eq. ([^ can be identified as Kirchhoff’s current law stating that 
the same amount of probability flowing into state i is also flowing out. 

i We adopt the notation that edges are written as upper (source) and lower (destination) indices, 
i.e., Xj is the value of X taken in the transition i —>■ j. 
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3.3.2. Entropy production Following refs. [27|[34l the total average entropy production 
rate (^tot) reads 


-ij 





^3 



(9) 


(Ssys) (Smed) 

The first term is identified as the time derivative of the Gibbs-entropy Ssys = 
— Pi In Pi and thus describes the entropy change of the system itself. The second 
term represents the coupling of the system with its environment (medium) in such 
a way that the system cannot reach equilibrium. The coupling can be, for example, 
caused by a heat or particle flux flowing from the medium into the system. Only 
the sum of both entropy production rates is larger than zero in accordance with the 
second law. If the system is in a non-equilibrium steady state (NESS), the first term 
vanishes and consequently (^tot) = (5'med) balance each other. 

A second important quantity related to entropy production are affinities, or 
generalized thermodynamic forces . The edge affinity between two states is defined 
as 



which can, again, be split into two parts. The first part cr* is identified as the entropy 
produced in the medium 34 for every transition f > j. Interestingly, neither the 


affinities nor the dissipated heat depend on the time scales governing the dynamics of 
the system as in both expressions only the ratio of rates are taken into account. 

Finally, the central concept in our approach to coarse-graining is cycles. A cycle 
is an ordered set of states (vertices), at the end of which the starting state is reached 
again. Cycles that differ only in their cyclic permutation of vertices are considered 
identical. For example, {1, 2,3,1} = {2,3,1, 2} = {3,1, 2, 3} all denote the same cycle 
but {2,1, 3, 2} is a different cycle. We further distinguish between trivial cycles (cycles 
with only two different states, be., {i,j, z}), and non-trivial cycles that contain at least 
three different states. 

Two types of observable^ can be distinguished for each cycle: 

Current-like observables that are summed along the edges corresponding to each 
cycle. Consider an observable defined on the edges of a graph, that is given by a 
matrix X G The cycle observable is computed as 


(i) 


= 


E 

ijGoi 


x] 


( 11 ) 


The notation J2ijea denotes the summation over all directed edges i ^ j that 
are part of cycle a. 

(ii) State-like observables that are summed over the states forming a cycle. Here the 
observables are defined on the graph vertices given by a vector Y G and 

thus 


Fa = 

iGcx 


(12) 


§ Throughout, greek indices denote cycles. 
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An example for the latter is given by the average cycle period Ta = 

average time that is spend in cycle a) with being the average time 

spend in state i. 


3.3.3. Cycle affinities The most prominent example for a current-like observable is 
the cycle affinity 

= (13) 

ij^Oi 

It has two important properties: 


(i) The cycle affinity of trivial cycles is always zero since affinities are anti-symmetric, 

A* = -Ai, and thus = Aj Aj = Aj - A* = 0. 

(ii) All cycle affinities are independent of the state probabilities and thus can be 
expressed by (t*’s only. 

To prove (ii), 


4 r 1 — 4* 


-f ... -f A” = In 


w^Pi... w'fpn 
wjpj . . . WlPi 


= In 


'W j - ■ i 


= a) + ...+<y-. (14) 


Since the pfs are state functions, going along a cycle they cancel pairwise. A 
remarkable property of a NESS is thus that the average entropy production can 
be calculated either by using the A*’s or cr^’s, which correspond to S'tot and 5'med, 
respectively. 


4. Cycle clustering 


f.l. Cycle-flux decomposition 

For convenience, we define the indicator 

I 1 if vertex i is in cycle a 

0 otherwise 


Xa = 


and passage function 


Xj,a 


if directed edge i 
otherwise, 


j is in cycle a 


(15) 


(16) 


depending on whether a vertex i or directed edge i —)■ j is part of a cycle a, respectively. 
Following the ideas of refs. [^[^|3^ the cycle-flux decomposition expresses the flux 
field $ through a linear combination of cycles, 

(17) 






The cycle weight ipa quantifies how much probability flows through cycle a per unit 
time. As shown in refs. |35[|36[ :/3o, > 0 is non-negative and the decomposition 
eq. @ always exists if $ satisfies the steady state condition (Kirchhoff’s current 
law). However, this decomposition is not unique as discussed in the next section 4.2 
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Using the cycle observables eq. (111 and inserting the cycle decomposition eq. (©> 
the average of a current-like observable 


(^) = E = E E = E 


(18) 


can be expressed as a cycle average with weights ipa 24 . In particular, the cycle 
average of affinities 


(ki) = ^<p„ki„ = ^$jki;. = (5tot) 


(19) 


equals the total average entropy production with the definition of edge affinities 


eq. (10). Furthermore, we define the conditional average entropy production of a 


single cycle Sq, = so that the total average entropy production becomes the sum 

('S'tot) = Set 


Jj..2. Algorithm 

Several types of algorithms have been proposed in the literature to accomplish the 
decomposition eq. (17) 27 35 37. For example the “method of derived chain” 


introduced in ref. |37[ w hich is stochastic in nature and has the advantage that the cycle 
weights ipa in eq. ( |17[ ) are unique and that they correspond to the mean number of 
passages through cycle a. However, negative cycle entropies (sq, < 0) may occur, which 
greatly complicates the identification of communities (as will become clear shortly). 
Moreover, the number of cycles used in the decomposition can be orders of magnitude 
larger than for the other approaches discussed below. Another important type of cycle 
decompositions, first mentioned by Schnakenberg 27 , considers fundamental cycles 


that span a basis of the cycle space. Although the number of contributing cycles is 
as small as for the described algorithm below, the fundamental cycles are not unique 
and can have negative cycle entropies as well. 

We employ a variant of the “cycle-flux” decomposition introduced by 
MacQueen 35 . This is a deterministic algorithm that has a polynomial complexity 


in the number of vertices |U|, making it computationally affordable even for large 
graphs. Again, the decomposition (and thus the cycle weights Lpa) is not unique 
but rather depends on the initial cycle sequencing, where already a minor variation 
in the sequencing can lead to different cycle weights. In particular, some cycle 
weights become zero while others become non-zero. An example illustrating this 
arbitrariness is given in ref. where the cycle-flux decomposition is applied to the 
totally asymmetric simple exclusion process (TASEP) model. 

Let C = {a} denote the set of all possible cycles. Only the subset C" C C of cycles 
with non-vanishing cycle weights contribute to averages, an upper bound |C"| < H of 
which is given by the Betti number B = \E\ — \V\ -1-1 36 . The problem is that one 


cannot know beforehand what the contributing cycles are, meaning that theoretically 
all possible cycles are needed to ensure a complete decomposition, although most of 
them will have zero weights especially for graphs with many vertices. Alrea dy t he 
number of possible cycles of an undirected graph is bounded by H < \C\ <2^ 

We now describe the algorithm in more detail. To this end, we sort cycles. 


38 


C — {ao, ai,..., a\E\/2i ■ ■ ■ i a|C|}j 


( 20 ) 


■v* 

trivial cycles 
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Figure 2. Normalized cumulative sum eq. \22\ of cycle entropy production of 
our model system. The first red line marks the number of cycles to recover 95% 
and the second to recover 99% of the total entropy production. 


where first come the trivial cycles followed by the non-trivial cycles containing three 
or more states. The number of trivial cycles is given by half the number of edges, 
I if 1/2. Initialize $ t— $ and take the first cycle a from C": 

(i) Take all fluxes $ along cycle a and find the smallest value, which becomes the 
cycle weight ipa, 

= min{$J}. (21) 

ijGa. 

(ii) Subtract (fa from all fluxes along a, —>■ — PaXfa- The new flux field has 

at least one edge less. 

(iii) Advance to the next cycle a in C and repeat. 

(iv) The algorithm stops when the residuum ||$||max has become smaller than some 
threshold. 


After the first |if|/2 iterations, all trivial cycles are subtracted and hence the flux field 
$ at this stage contains only irreversible edges, be., if > 0 then = 0. Hence, the 
fluxes have become the (positive) currents of the original graph, , where 

> 0 since the algorithm subtracts the smaller value. The original affinities A® along 
the remaining edges are thus positive, cf. eq. (10), and hence all cycle affinities for the 
remaining non-trivial cycles are positive. An illustrative example for a concrete graph 
is presented in appendix |7.1| Note that this implies Sq, > 0 for all non-trivial cycles. 

In appendix |7.2| an algorithm is outlined how to obtain all contributing cycles C. 
Removing the trivial cycles, we can thus further reduce the set of cycles C G C to be 
considered in the following with \C\ < B — |if|/2. 


4-3. Cycle clustering 

4-3.1. Absence of dominant cycles Markov state models that are constructed from 
molecular dynamics simulations contain thousands, or even millions, of cycles. This is 
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in stark contrast to the semi-empirical models discussed so far containing a handful of 
- already coarse-grained - states 21 25 . At this stage it is important to realize that 


there are no dominant cycles in terms of the entropy production. This is illustrated in 
figure]^ for our specific model system. We have chosen \V\ = 200 (this is an input to 
the fc-means clustering algorithm) and found \E\ = 10612 edges implying B = 10413 
with non-trivial cycles \C\ < 5107. Shown is the normalized cumulative sum 






Ea 


( 22 ) 


of the first n cycles for sorted cycle entropies sq > si > ■ • ■ Already for our simple 
model system a large number of cycles contribute and no dominant cycles appear. To 
recover 95% of the total entropy production, almost 2000 cycles have to be included, 
although many of the non-trivial cycles do not contribute substantially to the overall 
entropy production. 


4-3.2. Linking cycles Our approach in the following is based on the idea that many 
cycles are similar in their length, traversed states (region in state phase), and cycle 
affinities. We will propose how to quantify this “similarity” of cycles and how to group 
these cycles together, forming cycle communities. 

The cycles in C can again be interpreted as vertices of a graph Q = (C,fl). It 
is important to realize that we have a considerable freedom in defining edge weights 
(analogous to the original transition rates Wj). Here we opt to link cycles a and /3 
using 

^ ^ (23j 

2^^P^X^,a + l^^PiXi,l3 

which is symmetric as emphasized by the subscripts. Two cycles are linked only if 
they share at least one vertex on the original graph, while the connectivity strength 
0 < flap < 1 depends on the cumulated probability of the shared states. Other 
proposals for linking cycles can be found in refs. [^[33l 

In practice, we found that the resulting graph is connected too strongly and that 
better results are achieved if H is modified using additional information. We thus cut 
a link if (i) two cycles have dissimilar cycle affinities and (ii) if their spatial extensions 
differ too much. For the later, the spatial extension of a cycle is defined as the maximal 
Euclidian distance of centroid positions 


= max \\xi-Xj\\ 2 , 

(iJ)Ga 


where Xi is the state-space vector representing centroid i. Hence, 


0 


^a/3 ^ » 0 

^a/3 


• r min{^„,^/3} ^ c 
maxima ,^/3} — 

otherwise 


with f,a,fd G (0,1). Specifically, for our model system we have set fa 
resulting cycle graph is shown in figure 




(24) 


(25) 


i. The 
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Figure 3. Cycle graph constructed from our model system. Each vertex 
represents a non-trivial cycle. Colors indicate cycles belonging to the same cycle 
community. The vertices are arranged utilizing a force-directed graph drawing 
algorithm. 


4 . 3 . 3 . Cycle communities Having defined H to be symmetric and thus undirected, we 
can make use of well-developed community finding algorithms for undirected graphs. 
Formally, the communities are disjoint sets C; of the non-trivial cycles. 


c = Ua 


(26) 


A community is a group of vertices that share similar properties and are consequently 
stronger connected (higher link density) to vertices inside their community than with 
vertices from other communities. The number of communities is inherent to its graph 
and not preassigned. A review of the most important algorithms can be found in 
ref. |39[ In general, a measure of how good community finding algorithms perform 
is the modularity function Q, which measures the link density inside communities as 
compared to other communities by assigning it a value between -1 and 1 
defined as 


40 . It is 


Q=—y 

^ 2m ^ 
0.0 




with kp = ^a/3 and m=\ ^p. 



if a and /3 belong to same community 
otherwise 


(27) 


Specifically, we use a community finding algorithm 41 that has been shown to be 


fast and achieves good results also for a large number of cycles (vertices in the cycle 
space). Here we are interested in finding communities among all non-trivial cycles 
that were found by applying the cycle-flux decomposition algorithm described in the 
previous section |4.2[ The communities found for our model system are displayed in 
hgurej^in cycle space and in figure]^ for the original state space. Three communities 
have been detected: The red and green colored communities are in agreement with 
the loci of highest probability, cf. figure while the blue community contains cycles 
connecting these two loci. This result agrees well with the intuitive picture of entropy 
production due to the interplay between conservative and non-conservative forces for 
the particle trapped in a basin, with rare transitions between both basins. 
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Figure 4. Cycle communities in state space, where edges are colored according 
to the community of their cycles, cf. figure]^ Three communities with red, green, 
and blue have been identified by the community finding algorithm. 


5. Coarse graining 


5.1. Cycle representatives 


We have now identified communities of cycles with similiar properties determined 
through the link strength However, it is not possible to compute something 

like an average cycle because the cycle space does not have any physical metric. 
Furthermore, we are restricted to the original states and transition rates (or rather 
(T*’s) as they have a physical meaning, be., position in state space and medium entropy 
production per transition, respectively. To overcome this problem, we determine 
one cycle out of each cycle community, which we will refer to as the (community) 
representative. This cycle is then rescaled in order to preserve the entropy production 
of the whole community, and the other cycles are discarded. 

To this end, we rewrite the modularity function eq. (271 as the sum over all 
community modularities 






Cl) Cl)' 

2 m 


(28) 


where I denotes the Tth community. Our task is to find the cycle 7 for each cycle 
community I that maximizes Qi — Qi\j, with Qi\.y being the modularity of the Tth 
community without cycle 7 . In other words, we want to identify the cycle 7 for each 
community that has the biggest impact on the community modularity. In particular, 
Qi\.y increases if cycle 7 matches poorly and decreases when it provides many links to 
other cycles inside its community. After some algebra, the difference is given by 


Qi — Qi\~i 



m 





(0 


m 


(29) 


For our model system, the three representatives found from maximizing the modularity 
difference are illustrated in figure 
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Figure 5. Illustration of the cycle representatives for each community of our 
model system. The centroids belonging to a representative are colored according 
to the communities in figure whereas the cyan states highlight the crossings 
of representatives. The gray points indicate the centroids not belonging to a 
representative, which are thus absent in the coarse-grained model. 


5.2. Rescaling 


After determining the representatives, the original fluxes <i>* and transition rates Wj 
have to be rescaled. The physical constraints are: 


(i) The total entropy production (Stot) is preserved. 

(ii) The ct® ’s of all non-vanishing edges are preserved. 

(iii) The cycle affinities Aa of all contributing cycles are preserved. 


It is easy to check that (iii) is valid if (ii) is fulfilled, cf. eq. (14). For every cycle 
community, we define a community entropy production 


Si= Sa with (,5tot) = 

ctGCi I 


(30) 


Because all non-trivial, and thus possibly entropy producing, cycles are partitioned 
into communities, the total entropy production of the original graph is the sum of all 
community entropy productions. In what follows, all rescaled quantities are labeled 
by hats, e.g., ipa tpa- Moreover, the subscript I to cycle quantities denotes the cycle 
representative of the Z-th community, e.g., si is the cycle entropy production. 

We now identify the new entropy production of each representative cycle s; with 
the entropy production of its community Si. By making use of (iii), we compute the 
new cycle weight ipi as 

Si = Si = Ai(pi => ipi = > Q. (31) 

Ai 

The crucial coarse-graining step consists of removing all other non-trivial cycles by 
setting their weights (pa to zero. Because the entropy production of trivial cycles is 
always zero, {Stot) is preserved and (i) fulfilled. All states that are not part of a 
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community representative are thus removed with the remaining states V constituting 
the coarse-grained Markov state model, see figure 

All cycles in the coarse-grained model are either a representative or trivial. We 
now show that the weights of the remaining trivial cycles are always positive and the 
coarse-grained model thus still constitutes a valid cycle-flux decomposition 




(32) 


for the rescaled fluxes 4), cf. eq. (17). This can be achieved by demanding that all 
non-vanishing edge affinities Aj are preserved. By virtue of eq. (17), the new cycle 
weights ipa. can then be computed from 


exp A!. = ^ = a = ^ 

' <^1 <^1 T 


ggC' ^aXj,a 

aeC ‘f’o:Xi,a 


which can be rearranged to 




(33) 


(34) 


aec 


We now pick one edge i —>■ j for which the edge affinity A* > 0 is positive. We then 
split the sum over all cycles into a sum over trivial and non-trivial cycles, 

0 = <,3^(exp A] - 1) + E exp A] - yj „) (35) 

age 

= ippiexpA] - 1) - E (36) 

i 


For every edge there is exactly one trivial cycle, here denoted ft, for which y® ^ = 
= 1. As explained in section 4.2 for positive affinity all non-trivial cycles are 


xip _ 

oriented to follow the net flow, which implies xl a =0. The remaining sum over all 
non-trivial cycles participating in edge i ^ j reduces to the weight of the representative 
cycles since we have set the weight of all other non-trivial cycles to zero. We have 
thus determined the remaining weights tpp > 0 oi the trivial cycles, which clearly are 
positive. 

The final step is to obtain new probabilities pi for the remaining states. We 
simply scale out the probability of the removed states implying Pifpj = Pi/pj with 


normalization 
transition rates 


Pi = 1. Moreover, this is sufficient to preserve the ratio of 


exp CT,- = ^ = 


P^ 


Pz 


w 


(37) 


fulfilling condition (ii). Together with the rescaled fluxes we thus obtain transition 
rates w)® = /pi, which completes the formulation of the coarse-grained Markov state 
model. 


5.3. Numerical test 

To test the accuracy of the coarse-grained model, we compute the forward and 
backward rate of traversing between both basins for the continuous BD trajectory 
and our reduced Markov state model. The mesoscopic rates for the reduced Markov 
state model have been calculated using a kinetic Monte-Carlo scheme. Figure]^ shows 
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Time 


Figure 6. Rate computation for the transition between both minima. The 
normalized histograms show the distributions of the times needed for the specific 
transition, computed for the reduced Markov state model. The blue dotted line 
shows the exponential fit to obtain the mesoscopic transition rate, while the red 
line is the fit using the histogram computed by the BD trajectory. 


the normalized histograms and the exponential fit (dotted blue line) obtained for the 
coarse-grained model as well as the exponential fit obtained from the original BD 
simulations (red line). Due to the symmetry of the forces, these rates should be equal, 
which is recovered by the coarse-grained model. Moreover, the numerical values of 
both rates are in excellent agreement with the BD results (see table , illustrating 
that our coarse-graining method indeed preserves the mesoscopic time scales of the 
BD simulation. 


Table 1. Mesoscopic transition rates for the model system for the original 
Brownian dynamics (BD) and the coarse-grained (CG) model. 



BD 

CG 

left to right 

0.029(1) 

0.028(4) 

right to left 

0.029(5) 

0.028(3) 


6. Critical remarks and conclusion 


There are at least two steps within the approach described here that will require 
further clarification and investigation. The first issue is how to find optimal cycle 
communities. Detecting communities on a graph is, in principle, already a challenge 
on its own as the vast number of heuristic approaches have shown in the past. Also, 
it is still a matter of discussion whether the detected communities ought to fully or 
only partially partition the graph. The latter is also known as fuzzy partitioning, the 
advantage of which is that not all cycles are assigned to a specific community as some 
might be matching only poorly with any of the communities. To further improve the 


community detection it is possible to define the link strengths Vlap given in eq. (251 
somewhat differently by assigning directions to the cycle edges. In any case, once 
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Figure 7. Example for the cycle-flux decomposition. The numbers next to the 
arrows are the numerical values for the fluxes <I> along edges: (a) Initial graph 
with # = <I>, (b) after removing the trivial cycles, and (c) last remaining cycle. 


obtained the cycle communities should be checked for consistency. Of course, it is 
not possible to visualize the cycles by plotting them in configuration space for more 
than two dimensions. One way would be to project the cycles onto suitable reaction 
coordinates. 

The second issue is how to find suitable community representatives. Here we 


have formulated an optimization problem with an objective function, eq. (29), but 


it might be worthwhile to also explore other strategies. It seems to be of particular 
importance that the cycle affinities of the representatives are, in some way, representing 
the averaged community affinity and their internal time scales since they will govern 
the mesoscopic time scales of the coarse-grained model. Especially the latter is part 
of our ongoing investigation. 

To conclude, we have presented a simple and scalable algorithm to construct a 
Markov state model in non-equilibrium that preserves entropy production and cycle 
affinities and, therefore, the characteristics of macroscopic non-equilibrium transport. 
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7. Appendix 


7.1. Cycle-flux decomposition for a simple graph 

Here we want to give an example how to compute the cycle decomposition described 
in section 4.2 In figure [^a) a simple, reversible and connected graph is shown 
with the vertices A,B,C,D. The arrows represent the edges and the numbers 
corresponding to the fluxes flowing in and out of each state. For example, the 
probability flux = 4 and the reverse flux <I>^ = 1. It is easy to check that 
Kirchhoff’s current law is valid, e.g., the flux into state A equals the flux flowing out, 
= 5-|-l + l = 7 = l-|-4-|-2 = According to the Betti number, the maximal 
number of cycles needed for the cycle-flux decomposition is < 7. In the first step all 
trivial cycles (the detailed-balance part of the graph) are subtracted (see hgurej^b)). 
All trivial cycles found and their weights ip a. are listed in table The only remaining 
cycles are {A,C,D,A} and {A, H,ZI,A}, which are subtracted in step (b) and (c), 
respectively. Overall, 7 cycles are needed to complete the decomposition but only 
the last two cycles contribute to the mean entropy production, which is, according to 
eq. given by (:Stot) = I • In 15 -I- 3 • In 50 « 6.62. 







Cycle representatives for the coarse-graining of driven systems 


17 


Table 2. Cycles and cycle weights for the graph shown in figure]^ 


step 

cycles 

weights ipa 

cycle affinity A^ 

(a) 

{A,B,A} 

I 

0 


{A, 0,^1 

I 

0 


{A,D,A} 

I 

0 


{B,D,B} 

2 

0 


{G,D,G} 

2 

0 

(b) 

{A,C,D,A} 

I 

In 15 

(c) 

{A,B,D,A} 

3 

In 50 


7.2. Contributing cycles 


In this appendix, we explain how to effectively obtain the cycles that are used for 
the cycle-flux decomposition without first finding all possible cycles. We make use 
of a widely-used method in graph theory 27 . For every connected graph GiV, E) a 


spanning tree T{G) can be constructed, which is the set V of all vertices connected by 
\V\ — 1 edges. We further demand that the edge directions in T{G) are the same as in 
G. The V edges of G that are not part of T{G) are called chords, with v = |£'| — |F|-|-1. 
Again, it is important to preserve the original direction of each chord. 

After subtracting all trivial cycles, the new flux field $ is obtained, which 
contains only irreversible edges. For this graph the spanning tree T(cE>) is constructed. 
Adding one chord completes one non-trivial cycle. We determine all non-trivial cycles 
correspon ding to the chords and again apply the cycle-flux decomposition as outlined 
leading to a new flux field $. If fluxes remain, we repeat this procedure 


4.2 


in section 

by which iteratively new cycles are added. 
The complete algorithm reads as follows: 


(i) Apply cycle-flux decomposition for all trivial cycles. 

(ii) Create oriented spanning tree r($). 

(iii) Identify directed chords and find corresponding cycles. 

(iv) Apply cycle-flux decomposition to the newly identified cycles. 

(v) Check if remaining flux field ||$||max < threshold, otherwise repeat with step (ii). 
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